AITopics | latent feature

Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations

Neural Information Processing SystemsJun-23-2026, 00:38:44 GMT

Humans can efficiently extract knowledge and learn skills from the videos within only a few trials and errors. However, it poses a big challenge to replicate this learning process for autonomous agents, due to the complexity of visual input, the absence of action or reward signals, and the limitations of interaction steps. In this paper, we propose a novel, unsupervised, and sample-efficient framework to achieve imitation learning from videos (ILV), named Behavior Cloning from Videos via Latent Representations (BCV-LR). BCV-LR extracts action-related latent features from high-dimensional video inputs through self-supervised tasks, and then leverages a dynamics-based unsupervised objective to predict latent actions between consecutive frames. The pre-trained latent actions are fine-tuned and efficiently aligned to the real action space online (with collected interactions) for policy behavior cloning. The cloned policy in turn enriches the agent experience for further latent action finetuning, resulting in an iterative policy improvement that is highly sample-efficient. We conduct extensive experiments on a set of challenging visual tasks, including both discrete control and continuous control. BCV-LR enables effective (even expert-level on some tasks) policy performance with only a few interactions, surpassing state-of-the-art ILV baselines and reinforcement learning methods (provided with environmental rewards) in terms of sample efficiency across 24/28 tasks. To the best of our knowledge, this work for the first time demonstrates that videos can support extremely sample-efficient visual policy learning, without the need to access any other expert supervision.

bcv-lr, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction

Neural Information Processing SystemsJun-14-2026, 16:42:29 GMT

This work challenges the residual prediction paradigm in visual autoregressive modeling and presents FlexVAR, a new Flexible Visual AutoRegressive image generation paradigm. FlexVAR facilitates autoregressive learning with groundtruth prediction, enabling each step to independently produce plausible images. This simple, intuitive approach swiftly learns visual distributions and makes the generation process more flexible and adaptable.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Workflow (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

+39+26+56+67+20+15+22Coarse-grainedobjectFine-grainedobjectTexturePathologyUltrasounddatasetexpansionAuto-createddatawithnewinformationSmalldatasetExpandeddatasetcatdog

Neural Information Processing SystemsApr-30-2026, 06:41:05 GMT

The power of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data on a large scale is often expensive and timeconsuming. To address this issue, we explore a new task, termed dataset expansion, aimed at expanding a ready-to-use small dataset by automatically creating new labeled samples. To this end, we present a Guided Imagination Framework (GIF) that leverages cutting-edge generative models like DALL-E2 and Stable Diffusion (SD) to "imagine" and create informative new data from the input seed data. Specifically, GIF conducts data imagination by optimizing the latent features of the seed data in the semantically meaningful space of the prior model, resulting in the creation of photo-realistic images with new content. To guide the imagination towards creating informative samples for model training, we introduce two key criteria, i.e., class-maintained information boosting and sample diversity promotion. These criteria are verified to be essential for effective dataset expansion: GIF-SD obtains 13.5% higher model accuracy on natural image datasets than unguided expansion with SD. With these essential criteria, GIF successfully expands small datasets in various scenarios, boosting model accuracy by 36.9% on average over six natural image datasets and by 13.5% on average over three medical datasets.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine (0.47)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

1301962d8b7bd03fffaa27119aa7fc2b-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 18:51:14 GMT

artificial intelligence, endogenous function, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Data Science (0.68)
Information Technology > Communications (0.68)

Add feedback

Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering

Dogyoon Song, Christina E. Lee, Yihua Li, Devavrat Shah

Neural Information Processing SystemsMar-23-2026, 11:37:32 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions

Yichen Wang, Nan Du, Rakshit Trivedi, Le Song

Neural Information Processing SystemsMar-23-2026, 09:21:07 GMT

Matching users to the right items at the right time is a fundamental task in recommendation systems. As users interact with different items over time, users' and items' feature may evolve and co-evolve over time. Traditional models based on static latent features or discretizing time into epochs can become ineffective for capturing the fine-grained temporal dynamics in the user-item interactions. We propose a coevolutionary latent feature process model that accurately captures the coevolving nature of users' and items' feature. To learn parameters, we design an efficient convex optimization algorithm with a novel low rank space sharing constraints. Extensive experiments on diverse real-world datasets demonstrate significant improvements in user behavior prediction compared to state-of-the-arts.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
(3 more...)

Add feedback

Reconstructing perceived faces from brain activations with deep adversarial neural decoding

Neural Information Processing SystemsMar-17-2026, 18:23:44 GMT

Here, we present a novel approach to solve the problem of reconstructing perceived stimuli from brain responses by combining probabilistic inference with deep learning. Our approach first inverts the linear transformation from latent features to brain responses with maximum a posteriori estimation and then inverts the nonlinear transformation from perceived stimuli to latent features with adversarial training of convolutional neural networks. We test our approach with a functional magnetic resonance imaging experiment and show that it can generate state-of-the-art reconstructions of perceived faces from brain activations.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Add feedback

Expanding Small-Scale Datasets with Guided Imagination

Neural Information Processing SystemsFeb-17-2026, 22:02:17 GMT

Does knowledge distillation really work?

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Singapore (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.46)

Industry:

Information Technology (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Expanding Small-Scale Datasets with Guided Imagination

Neural Information Processing SystemsFeb-17-2026, 22:02:14 GMT

The power of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data on a large scale is often expensive and time-consuming.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: